(54) Disk array device and method for controlling the same



(57) The disk array device includes a disk control
device (20) connected to a central processing unit (10)
and a plurality of disk drives (300) composing disk
arrays under the control of said disk control device (20).
The disk control device (20) includes a redundant data
generator (130), a difference data generator (140), and
a redundant data generation method selecting function
(37). According to an access pattern from a host, a load
state of the disk drive (300), and a failure, the disk array
device selects a proper redundant data generating
method from among a method of read and modify and a
method of all stripes, both of which generate the
redundant data in the disk control device (20), and a
method of a generation in a drive and a method of
difference, both of which generate the redundant data on
the disk drive (300) for saving the redundant data. The
purpose is to reduce the overhead accompanying
generation of the redundant data and to improve the
reliability of generating the redundant data.

Description

BACKGROUND OF THE INVENTION 
Field of the Invention

The present invention relates to a disk array technique and a technique for controlling the disk array, and more
particularly to a technique for enhancing the efficiency and reliability of a process for generating redundant data.

Description of the Related Art 

David A. Patterson et al. have reported, in UCB/CSD/87/391 (December 1987), a University of California report,
a storage method that takes the steps of preparing a plurality of storage devices (disk drives), dividing the I/O
data from a host computer into a number of parts corresponding to the number of the storage devices, recording
and reproducing the divided data into and from the storage devices, and recovering fault data from the normal
storage device(s) if a temporary or permanent failure takes place in some of the storage devices. This report said
that the following two methods are provided for generating redundant data.
The first method for generating the redundant data, which is referred to as a method of read and modify, is arranged
to use write data from a host computer, previous data (data before update) stored in a storage device where the write
data is to be stored, and previous redundant data (redundant data before update) stored in a storage device where the
redundant data is to be stored, for the purpose of generating the redundant data. In this method, assuming that the
divisional number of the write data is a, the I/Os for writing to the storage devices take place a + 1 times and the
I/Os for reading take place a + 1 times as well. The total I/O times reach 2 x a + 2. If the storage device does not
contain the redundant data, the access times are a. It means that the use of the redundant data results in increasing
the I/O times by a + 2.

The second method for generating the redundant data, which is referred to as a method of all stripes, is arranged
to use the divided parts of write data from the host computer and data read from those storage devices of the ECC
group, except those for saving the redundant data, that do not save the write data from the host, for the purpose of
generating the redundant data.

In this method, assuming that the divisional number of the write data is a and the number of the storage devices
composing an ECC group, except those for saving the redundant data, is b, the times of I/Os to and from the storage
devices are b + 1 in total, that is, b - a in the I/O times for reading and a + 1 in the I/O times for writing. For
storage devices that do not contain the redundant data, since the access times are a, the use of the redundant data
results in increasing the I/O times by b + 1 - a.

Apart from the foregoing methods, the method for generating the redundant data in the storage device has been
disclosed in U.S. Patent No. 5,613,088, wherein the redundant data is generated in the storage device provided with
two heads for read and write. Concretely, the read head and the write head are fixed on a common actuator so that the
read head reads the parity data before update and then the write head writes, on the same area, the updated parity
data generated from the parity data before update, without waiting for the disk to spin once.

The foregoing two methods for generating the redundant data, that is, the method of read and modify and the
method of all stripes, have an increased number of I/Os when writing the data because of the use of the redundant
data, so that the disk control device with redundancy is inferior in performance to the disk control device without
redundancy.

Hence, the conventional disk control device with redundancy selectively employs the method with a smaller number
of I/Os to and from the storage device for the purpose of reducing the I/Os to and from the storage device in writing
data. This selection makes it possible to reduce the burden on the storage device and thereby improve the processing
speed. That is, in the case of a >= (b - 1)/2, the method of all stripes has a smaller increase in I/O times than the
method of read and modify, while in the case of a < (b - 1)/2, the method of read and modify has a smaller increase.
By using this, if the data length of the write data from the host computer is in the range of a < (b - 1)/2, for
example, in the case of a transaction process, the disk control device arranged to use this selection for generating
the redundant data operates to generate a parity through the effect of the method of read and modify. In this method,
the I/O times to the storage device become four at minimum when a = 1, which is the smallest load burdened on the
storage device. In other words, however, in such a case of the transaction process, it means that the critical point
of performance takes place when a = 1. The performance cannot be improved further unless the method of process at
this time is reconsidered. The problem about the method of read and modify is essentially based on the fact that two
I/Os are issued to the storage device for saving the redundant data and a mechanical overhead such as the movement of
the head and spinning on standby is burdened at each I/O time. The mechanical overhead is a great bottleneck on the
disk control device that is in electric operation.
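
The I/O counts discussed above can be tabulated in a few lines. The following is a minimal sketch and not part of the original disclosure: the function names are hypothetical, a is the divisional number of the write data, b is the number of storage devices in the ECC group other than the one saving the redundant data, and the split of the all-stripes total into b - a reads and a + 1 writes is an inference from the stated increase of b + 1 - a.

```python
def read_and_modify_io(a: int) -> int:
    """Total I/Os for the method of read and modify: (a + 1) reads plus (a + 1) writes."""
    return 2 * a + 2


def all_stripes_io(a: int, b: int) -> int:
    """Total I/Os for the method of all stripes: (b - a) reads plus (a + 1) writes (inferred split)."""
    return (b - a) + (a + 1)


def extra_io(a: int, b: int) -> dict:
    """Extra I/Os of each method compared with a non-redundant write, which needs a I/Os."""
    return {
        "read_and_modify": read_and_modify_io(a) - a,  # a + 2
        "all_stripes": all_stripes_io(a, b) - a,       # b + 1 - a
    }


def cheaper_method(a: int, b: int) -> str:
    """Pick the method with the smaller increase; ties go to all stripes, matching a >= (b - 1)/2."""
    extra = extra_io(a, b)
    if extra["all_stripes"] <= extra["read_and_modify"]:
        return "all_stripes"
    return "read_and_modify"
```

With b = 6, for instance, a single-part write (a = 1) favours read and modify at four I/Os in total, while a = 3 already favours all stripes.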



The method disclosed in U.S. Patent No. 5,613,088 makes it possible to generate the redundant data in the storage
device provided with two heads for read and write, thereby reducing the spinning times of the drive on standby. In
case this method is expanded to a general storage device provided with a single head, the resulting method takes the
steps of transferring the updated data and the data before update read from the storage device to the storage device
for saving the redundant data and enabling the storage device for saving the redundant data to read the redundant data
before update and generate the redundant data from the transferred updated data, the data before update, and the
redundant data before update. This method is referred to as a method of a generation in a drive. In this method, when
the head is positioned to read the redundant data before update after the spinning on standby and reaches the next
writing position, the write is started. This operation makes it possible to avoid the spinning on standby during the
writing interval and merely needs one movement of the head and one standby spin. As a result, if the length of the data
from the host computer is short, the processing speed of the control device can be improved further.

However, the method of a generation in a drive cannot necessarily offer the essential effects depending on an 
access pattern from the host computer, the load burdened on the storage device, and the like. 

That is, if the length of the generated redundant data is longer than one spin of the disk, that is, the length of the
write data from the host computer is longer than one spin of the disk, the method of a generation in a drive requires
the disk to spin on standby during the interval between reading the redundant data before update and writing the
updated redundant data. However, the spinning on standby results in increasing an occupying time of the drive for
saving the redundant data, thereby increasing a response time of the drive for saving the redundant data. If,
therefore, the length of the redundant data is longer than one spin, the method of a generation in a drive has a
problem that the disk array device has a degraded response time.

In the method of a generation in a drive, the amount of data before update to be read out increases as the divisional
number of the write data from the host computer becomes greater, thereby increasing the load burdened on the storage
device. Hence, if the divisional number of the write data is great, the method of a generation in a drive
disadvantageously lowers the throughput of the disk array device.

The use of the method of a generation in a drive increases the occupying time of the drive for saving the redundant
data at each spin as compared with the method of read and modify, thereby increasing the load burdened on the drive
for saving the redundant data in a highly multiplexed, high-load environment. Hence, the method of a generation in a
drive may raise the probability that the drive for saving the redundant data is in use, thereby lowering the
throughput of the drive.

When the write data is transferred from the host computer to the disk control device together with an explicit
specification of consecutive pieces of data, the method of a generation in a drive operates to immediately generate
the redundant data on the transferred write data. As a result, when the succeeding write data is transferred from the
host computer, the chance of using the method of all stripes to generate the redundant data together with the first
write data may be lost. Hence, if the use of the method of a generation in a drive prevents the use of the method of
all stripes, this disadvantageously lowers the efficiency of generating the redundant data, thereby degrading the
throughput of the disk array device.

When the redundant data is generated by the method of a generation in a drive, the generation of the redundant data
may become unsuccessful because of a failure such as a failure caused in reading the redundant data before update.
In this case, the redundancy of the ECC group may be lost at once.

SUMMARY OF THE INVENTION

It is an object of the present invention to avoid the spinning on standby caused if the data length of the write data 
from an upper system is longer than one spin and improve a response time of the disk array device caused if the disk 
drive composing the disk array generates the redundant data. 
It is a further object of the present invention to improve a throughput of the disk array device by selecting the most
appropriate method for generating redundant data so that the necessary reading number of the data before update is
made minimal according to the divisional number of the write data received from the upper system.

It is a yet further object of the present invention to improve a response time of the disk array device by reducing an
occupying time for one spin of the disk drive in the case of generating the redundant data through the effect of the
disk drive composing the disk array.

It is another object of the present invention to improve a throughput of a disk array device by enhancing the effi- 
ciency of generating the redundant data in association with the process for write data according to an access pattern 
required by the upper system. 

It is still another object of the present invention to enhance reliability of a process for generating the redundant
data through the effect of the disk drive composing the disk array.

According to the invention, a disk array device having a plurality of disk drives composing a disk array and a disk
control device for controlling those disk drives includes a plurality of means for generating the redundant data in
different ways, and a selective control logic for selectively executing at least one of the plurality of means for
generating redundant data.

The disk array device including a plurality of disk drives composing a disk array and a disk control device for con- 
trolling the disk drives is dynamically switched from the generation of the redundant data in the disk control device to 
the generation of the redundant data inside of the disk drive according to an operating status. 

Concretely, as an example, the disk array device according to the invention includes the following components. 

That is, the disk array device comprising a plurality of disk drives, which is arranged to set a logic group of a
partial set of the disk drives and save the redundant data in part of the logic group for the purpose of recovering the
fault data from the normal disk drives when some of the disk drives are disabled by temporary or permanent failure,
provides means for generating the redundant data in each of the disk drives.

The disk control device includes a first redundant data generating circuit inside of a control device for generating
new redundant data from the divided write data received from the upper system, the previous data to be updated by the
divided write data, and the redundant data of the data group of the divided write data; a second redundant data
generating circuit inside of the control device for generating new redundant data of the data group from the data that
is not updated by the divided write data contained in the data group; a difference data generating circuit for
generating difference data from the divided write data received from the upper system and the previous data updated by
the divided write data; and a selective control circuit for selecting the circuit for generating the redundant data.

Further, the disk control device provides means for determining a data length of the write data received from the
upper system, means for detecting a load of the drive for saving the redundant data, means for determining if the
transfer of consecutive pieces of data from the upper system is explicitly specified, and means for detecting if the
generation of the redundant data inside of the disk drive for saving the redundant data has failed. The selective
control circuit operates to select a proper method for generating the redundant data.

The disk array device and the method for controlling the disk array device as described above operate as follows,
as an example.

The disk array device operates to determine the data length of the write data sent from the upper system and gen- 
erate the redundant data in the disk drive if the data length is determined to be shorter than one spin of the disk. Hence, 
the disk array device operates to suppress the spinning on standby caused inside of the disk drive for saving the redun- 
dant data, thereby improving a throughput of the disk array device. 

If the data length of the write data sent from the upper system is determined to be longer than one spin of the disk,
the difference data between the divided write data and the data before update stored on the disk drive for saving the
divided data is transferred onto the disk drive for saving the redundant data. The disk drive for saving the redundant
data operates to generate the redundant data from the difference data and the redundant data before update, thereby
suppressing the spinning on standby caused in the disk drive for saving the redundant data and improving the
throughput of the disk array device accordingly.
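
The difference-based update can be illustrated with a short sketch. It is not taken from the disclosure: it assumes the redundant data is an exclusive-OR parity, as in the embodiment described later, and the function names and sample byte values are hypothetical.

```python
def xor_bytes(x: bytes, y: bytes) -> bytes:
    """Bytewise exclusive OR of two equal-length blocks."""
    if len(x) != len(y):
        raise ValueError("blocks must have equal length")
    return bytes(p ^ q for p, q in zip(x, y))


def difference_data(new_data: bytes, old_data: bytes) -> bytes:
    """Difference computed in the disk control device before transfer to the parity drive."""
    return xor_bytes(new_data, old_data)


def parity_from_difference(old_parity: bytes, diff: bytes) -> bytes:
    """Update performed inside the drive that saves the redundant data."""
    return xor_bytes(old_parity, diff)


# Hypothetical three-block data group and its parity before the update.
blocks = [b"\x0f\xf0", b"\x33\x55", b"\xaa\x01"]
old_parity = xor_bytes(xor_bytes(blocks[0], blocks[1]), blocks[2])

# A write replaces the second block; only the difference reaches the parity drive.
new_block = b"\x12\x34"
diff = difference_data(new_block, blocks[1])
new_parity = parity_from_difference(old_parity, diff)

# Recomputing the parity from the full data group gives the same result.
recomputed = xor_bytes(xor_bytes(blocks[0], new_block), blocks[2])
```

Under this assumption the parity drive needs only the difference data and its own old parity, so the data before update does not have to be transferred to it separately.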

The method for controlling the disk array device is executed to determine a load burdened on the disk drive for sav- 
ing the redundant data, generate the redundant data in another disk control device without having to actuate the 
method of a generation in a drive if the load is determined to be greater than or equal to a given value, for the purpose 
of distributing the load associated with the generation of the redundant data. In the highly multiplexing and high load 
environment, by suppressing the increase of the load to be put on the disk drive for saving the redundant data, it is pos- 
sible to suppress the probability that the disk drive for saving the redundant data is in use and thereby improve the 
throughput of the disk array device. 

The method for controlling the disk array device is executed to determine if the transfer of consecutive pieces of
data from the upper system to the disk control device is explicitly specified and, if so, to generate the redundant
data in a block a short time later, after the write data reaches a sufficient length, without immediately generating
the redundant data. This makes it possible to improve the efficiency of generating the redundant data and the
throughput of the disk array device.

When the method of a generation in a drive fails in generating the redundant data, the method for generating the
redundant data is switched to another method. This makes it possible to increase the chances of recovering from the
failure and thereby improve reliability of the disk array device.
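
The selection criteria summarized above can be condensed into a small decision function. This is only a sketch under stated assumptions: the order of the checks, the handling of a write exactly one spin long, the choice of the disk control device as the fallback after a failure, and all names and units are hypothetical and are not specified by the text.

```python
def select_generation_site(data_length: int,
                           one_spin_length: int,
                           parity_drive_load: float,
                           load_threshold: float,
                           drive_generation_failed: bool = False) -> str:
    """Return where the redundant data would be generated under the summarized criteria."""
    # A failed in-drive generation switches to another method (assumed: the control device).
    if drive_generation_failed:
        return "controller"
    # A heavily loaded parity drive is spared; generation moves off that drive.
    if parity_drive_load >= load_threshold:
        return "controller"
    # Writes shorter than one spin use the method of a generation in a drive.
    if data_length < one_spin_length:
        return "generation_in_drive"
    # Longer writes (assumed: also writes of exactly one spin) use the method of difference.
    return "difference"
```

A caller would pass the data length and load it has measured, for example `select_generation_site(4096, 65536, 0.2, 0.8)`, which yields `"generation_in_drive"` for these hypothetical figures.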

BRIEF DESCRIPTION OF THE DRAWINGS 

Fig. 1 is a diagram showing an arrangement of an information processing system including a disk array device 
according to the present invention; 

Fig. 2 is a view showing an internal arrangement of a disk drive used in the disk array device according to the 
present invention; 

Fig. 3 is a view showing an example of mapping data to be given to and received from an upper system in the disk 
drive used in the disk array device according to the present invention; 

Fig. 4 is a diagram showing an arrangement of hardware of an information processing system including a disk array
device according to the present invention;

Fig. 5 is a flowchart showing an example of a process for selecting a method for generating redundant data in a
disk array device according to the present invention;

Fig. 6 is a concept view showing a flow of data executed in generating redundant data through the effect of a
method of all stripes in the disk array device according to the present invention;

Fig. 7 is a concept view showing a flow of data executed in generating redundant data through the effect of a
method of a generation in a drive in the disk array device according to the present invention;

Fig. 8 is a concept view showing a flow of data executed in generating redundant data through the effect of a
method of read and modify in the disk array device according to the present invention;

Fig. 9 is a concept view showing a flow of data executed in generating redundant data through the effect of a
method of difference in a disk array device according to an embodiment of the present invention; and

Fig. 10 is a view showing an arrangement of a cache directory in the disk control device included in the disk array
device according to an embodiment of the present invention.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

Hereafter, an embodiment of the invention will be described in detail with reference to the appended drawings.
Fig. 1 is a concept view showing an arrangement of an information processing system including a disk array device
according to an embodiment of the present invention. The information processing system according to this embodiment
is arranged to have a central processing device 10 (referred to as a host 10) and a disk array device connected
thereto. The disk array device is configured of a disk control device 20 and seven disk drives 300 (300a to 300g) to
be operated independently of each other, each of which disk drives may be a magnetic disk unit, for example. These
seven disk drives 300 compose an ECC group (a unit on which data is recovered when failure takes place). The host 10
is coupled with the disk control device 20 through a channel bus 60. The disk control device 20 is connected to each
disk drive 300 through the corresponding drive bus 70 so that each disk drive 300 may be operated independently of
each other.

The disk control device 20 is arranged to have a host I/F 40, a data divider/reproducer 80, a microprocessor 30, a
cache memory 110, a cache directory 120, a redundant data generator 130, a difference data generator 140, and a
drive controller 50. The host I/F 40 and the data divider/reproducer 80 are coupled to the channel bus 60, the cache
memory 110, and the microprocessor 30 through signal lines. The microprocessor 30 is coupled to the cache memory 110,
the cache directory 120, the redundant data generator 130, the difference data generator 140, and the drive controller
50 through signal lines. The cache memory 110, the cache directory 120, the redundant data generator 130, the
difference data generator 140, and the drive controller 50 are controlled by a cache memory reference function 31, a
cache directory reference function 32, a drive state reference function 33, a redundant data generator control
function 34, a difference data generator control function 35, and a drive controller control function 36, all of which
are executed by micro programs built in the microprocessor 30.

The microprocessor 30 includes a redundant data generating method selecting function 37 served as means for
determining the method for generating the redundant data, a sequential access mode determining function 38 served
as means for checking if the sequential access from the host 10 is specified, and a mapping function 39 served for
calculating a write position on the actual disk drive 300 from the write data from the host 10. Those functions are
executed by the micro programs built in the microprocessor 30 itself.

The cache memory 110, the cache directory 120, the redundant data generator 130, and the difference data generator
140 are coupled through signal lines. The cache memory 110 is coupled to the drive controller 50 through a signal
line so that data is allowed to be transferred between the cache memory 110 and the drive controller 50. The cache
memory 110 is divided into a write plane 111 on which the write data from the host 10 is saved and a read plane 112
on which the data read from the disk drive 300 is saved. Each of the read plane 112 and the write plane 111 contains
slots 113 to 118 on which each plane is divided into sectors.

Fig. 10 is a concept view showing an arrangement of a cache directory 120 according to this embodiment. In this
embodiment, the cache directory 120 contains several kinds of information to be set thereto. Those kinds of
information include cache managing information 121 for managing data of the cache memory 110, data length information
122 for saving a data length received from the host 10, access pattern information 123 for storing the access pattern
in the case of specifying an access pattern such as a random access and a sequential access from the host 10, and
pending flag information 124 for saving a pending flag for holding a data write process containing generation of
redundant data until a series of sequential accesses are terminated if the access pattern is the sequential pattern.
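
As an illustration of the four kinds of information held in the cache directory 120, the following sketch models them as a small record. The field types, the `AccessPattern` enumeration, and the `note_write` helper are hypothetical; the text specifies only what each item is for, not how it is represented.

```python
from dataclasses import dataclass, field
from enum import Enum


class AccessPattern(Enum):
    RANDOM = "random"
    SEQUENTIAL = "sequential"


@dataclass
class CacheDirectory:
    # 121: bookkeeping for data held in the cache memory (representation assumed)
    cache_managing_info: dict = field(default_factory=dict)
    # 122: data length received from the host
    data_length: int = 0
    # 123: access pattern specified by the host
    access_pattern: AccessPattern = AccessPattern.RANDOM
    # 124: set while the write, including parity generation, is being held back
    pending: bool = False

    def note_write(self, length: int, pattern: AccessPattern) -> None:
        """Record a host write; sequential writes are marked pending."""
        self.data_length = length
        self.access_pattern = pattern
        self.pending = pattern is AccessPattern.SEQUENTIAL


directory = CacheDirectory()
directory.note_write(1024, AccessPattern.SEQUENTIAL)
```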

Fig. 2 is a concept view showing an internal arrangement of the disk drive used in this embodiment. The disk drive
300 includes a disk I/F 310 for controlling transfer of information between the disk drive 300 and the outside through
the drive bus 70, a drive buffer 330 for temporarily holding data received from the outside and data read from an
inside disk medium 350, a disk control mechanism 340 for controlling a positioning operation of a head (not shown)
with respect to the disk medium 350, and a microprocessor 320 for controlling all of those components. The disk I/F
310, the drive buffer 330 and the disk control mechanism 340 are coupled to the microprocessor 320 through signal
lines and are controlled by the microprocessor 320.

In this embodiment, the microprocessor 320 of each disk drive 300 has a function of generating new redundant
data from the data received from the outside and the redundant data before update saved in the disk medium 350 inside
of the disk drive 300 itself and then saving the new redundant data in the disk medium 350 inside of the disk drive
300. This function may be executed by the microprocessor 320 or by dedicated hardware other than the microprocessor
320.

Fig. 3 is a concept view showing an example of mapping I/O data to be given to or received from the host 10 onto
the disk medium 350 located inside of the disk drive. In this embodiment, the data recording area of the disk medium
350 of each disk drive 300 is logically divided into a plurality of unit areas. The concatenation of each unit area of
the disk drives 300a to 300g composes a data group containing at least one piece of redundant data.

That is, the redundant data P000 (parity) is located on the unit area of the rightmost disk drive 300g of the first
column. On the second column or later, the redundant data is shifted by one to the left of the storage position of the
parity on the previous column. If the storage position of the redundant data of the previous column is located on the
leftmost disk drive 300a, the redundant data P001~ is located on the rightmost disk drive 300g. The divisions D000 to
D029 of the write data from the host 10 are sequentially mapped from the disk drive located immediately to the right
of the redundant data, or from the leftmost disk drive 300a if the redundant data is located on the rightmost disk
drive. The redundant data of each column is generated to be equal to the exclusive OR of the data of each column, for
example, D000 to D005, and then is saved. If one failure takes place in one data piece of a column, the fault data can
be recovered by the exclusive OR of the remaining data inside of the column with the redundant data.
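
The rotating placement and the exclusive-OR recovery described above can be sketched as follows. This is an illustration only: drives 300a to 300g are numbered 0 to 6, columns are numbered from 0, the function names are hypothetical, and wrapping from the rightmost drive back to the leftmost when laying out data to the right of the parity is an assumption the text does not state explicitly.

```python
from functools import reduce

N_DRIVES = 7  # disk drives 300a..300g, indexed 0..6


def parity_drive(column: int) -> int:
    """Drive holding the redundant data: 300g for the first column, then one to the left each column."""
    return (N_DRIVES - 1 - column) % N_DRIVES


def data_drives(column: int) -> list:
    """Drives holding data, starting just right of the parity drive (wrap-around assumed)."""
    p = parity_drive(column)
    return [(p + 1 + k) % N_DRIVES for k in range(N_DRIVES - 1)]


def locate(division: int) -> tuple:
    """Map division number n of D000, D001, ... to (column, drive index)."""
    column, k = divmod(division, N_DRIVES - 1)
    return column, data_drives(column)[k]


def xor_blocks(blocks) -> bytes:
    """Exclusive OR of equal-length blocks, taken byte position by byte position."""
    return bytes(reduce(lambda x, y: x ^ y, position) for position in zip(*blocks))


# Parity of one column of six hypothetical data pieces, then recovery of a lost piece.
column_data = [bytes([i, 255 - i]) for i in range(6)]
parity = xor_blocks(column_data)
survivors = column_data[:2] + column_data[3:]
recovered = xor_blocks(survivors + [parity])
```

Here `recovered` reproduces the third data piece, which was left out of `survivors` to stand for the failed drive.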

In addition, the redundant data for error recovery is not limited to an exclusive OR of a group of plural divided
data and may be any code such as a Hamming code.

Fig. 4 is a concept view showing the arrangement of the disk array device according to this embodiment. The channel
bus 60 shown in Fig. 1 corresponds to 260-1 to 260-8. The host I/F 40 and the data divider/reproducer 80 shown in
Fig. 1 correspond to host adapters 231-1 and 231-2. The microprocessor 30, the drive controller 50, the redundant data
generator 130, and the difference data generator 140 shown in Fig. 1 correspond to disk adapters 233-1 to 233-4. The
cache memory 110 shown in Fig. 1 corresponds to cache memories 232-1 to 232-2. The cache directory 120 shown in
Fig. 1 corresponds to shared memories 234-1 to 234-2. The drive bus 70 shown in Fig. 1 corresponds to 270-1 to
270-16.

The host adapters 231-1 to 231-2, the cache memories 232-1 to 232-2, the disk adapters 233-1 to 233-4, and the
shared memories 234-1 to 234-2 are connected to each other through doubled data transfer buses 237-1 to 237-2.

Under the control of the disk control device 20, the inside of each of the disk drive boxes 241-1 to 241-2 is
connected to a storage device 240 arranged to accommodate the disk drives 242-1 to 242-32 and the disk drives 242-33
to 242-64.

The host adapters 231-1 to 231-2, the disk adapters 233-1 to 233-4, and the shared memories 234-1 to 234-2 are
connected to a service processor 235 through a communication path 236 inside of a control unit. This service processor
235 is handled from the outside through a maintenance terminal 250.

Fig. 1 illustrates a single ECC group (group of disk drives), while Fig. 4 illustrates eight ECC groups (groups of
disk drives) in total. The disk drives 242-1 to 242-7, 242-9 to 242-15, 242-17 to 242-23, and 242-25 to 242-31
correspond to ECC groups. The disk drives 242-8, 242-16, 242-24 and 242-32 are spare disk drives. The same applies to
the disk drives 242-33 to 242-64.

The description will be oriented to how the microprocessor 30 of the disk control device 20 arranged as described
above operates when the host 10 issues write data, along the flow shown in Fig. 5. The write data from the host 10 is
transferred to the disk control device 20 through the channel bus 60 and is divided into sector lengths by the data
divider/reproducer 80. When the host I/F 40 serves to save each division in the slot of the write plane 111 of the
cache memory 110, at a step 1000 of Fig. 5, the microprocessor 30 operates to count the data length transferred from
the host I/F 40 to the cache memory 110 and then save the count value in the data length information 122 included in
the cache directory 120. Proceeding to a step 1010, the sequential access mode determining function 38 is executed to
determine if the sequential access is specified by the host 10. The presence or absence of the sequential access
specification is saved in the access pattern information 123 of the cache directory 120.

If the sequential access is specified, the operation goes to a step 1150. At this step, the termination of the writing
process is reported to the host 10 without immediately generating the redundant data. Then, for indicating that the
write data is not yet reflected on the disk drive 300, a pending flag is raised in the pending flag information 124 on
the cache directory 120 and then the microprocessor 30 waits for a specified time. This is because, if the sequential
access is specified, there exists a high probability that the succeeding write data is issued from the host 10 and
that the method of all stripes may be used for generating the redundant data by using plural pieces of write data.
Hence, the microprocessor just waits for the transfer of the succeeding data until the stripes (a set of slots to be
written on the same position of the different disk drives) are filled with the write data. The proper waiting time is
calculated by (data length of the stripe - transferred data length) / data transfer speed from the host. After the
wait for the specified time, the operation goes to a step 1020.
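
The waiting-time formula can be written out as a small helper. This is a sketch rather than the controller's actual routine: the function and parameter names, the units (bytes and bytes per second), and the clamping to zero once the stripe is already full are assumptions.

```python
def sequential_wait_time(stripe_length: int,
                         transferred_length: int,
                         host_transfer_rate: float) -> float:
    """Seconds to wait: (stripe length - transferred length) / host transfer rate."""
    if host_transfer_rate <= 0:
        raise ValueError("host_transfer_rate must be positive")
    # Once a full stripe has arrived there is nothing left to wait for (assumption).
    remaining = max(stripe_length - transferred_length, 0)
    return remaining / host_transfer_rate


# Hypothetical figures: a stripe of six 512-byte slots, two slots received, 1 MB/s from the host.
wait = sequential_wait_time(6 * 512, 2 * 512, 1_000_000.0)
```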

If no sequential access is specified from the host at the step 1010, the operation at the step 1020 is immediately
executed. The microprocessor 30 operates to calculate the positions of the disk drive 300 where the write data of each 
slot is saved and of the disk medium 350 through the effect of the mapping function 39 for the purpose of deriving the 
column on the disk dnve 300 where the write data is to be written. Next, the microprocessor 30 operates to calculate 
the number of the read processes of the data before update through the effect of the redundant data generating method 
«! f K^ihl ?" I I nU ^ & ° f the re3d P rocesses j s required to generate the redundant data through the effect 
of the method of read and modify. This is calculated as the number of the slots of the write data plus one Next the proc- 

rn^J ?f. rate ! to J ca,c " late _ ,he number of read Presses through the effect of the redundant data generating 
TJ^t mi e 2^ n ?T f This n , number of read P rocesses is required to generate the redundant data through the 
effect of method of all stripes. Th.s is a value calculated by subtracting the number of the disk drives 300 for saving 

US? f i ^ ° f diSK driV6S 300 ,0r ^ th6 data - Then ' the redundant data venerating method 

selecting function 37 is executed to compare the necessary number of reads in the method of all stripes with the nec- 
essary number of reads in the method of read and modify. If the conditional expression of: 

(the necessary number of reads in the method of all stripes) ≤
(the necessary number of reads in the method of read and modify)    (expression 1)

is met, the method of all stripes is selected. Then, the operation goes to a step 1160.
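
The waiting time and the read-count comparison described above can be sketched as follows. This is only an illustrative sketch, not the micro program of the embodiment: the function names, the byte and second units, and the example of six data drives are assumptions introduced here.

```python
def stripe_wait_time(stripe_len, transferred_len, host_rate):
    """Seconds to wait for the rest of a stripe to arrive from the host.

    Implements (data length of the stripe - transferred data length)
    / (data transfer speed from the host). Units are assumed to be
    bytes and bytes per second.
    """
    if host_rate <= 0:
        raise ValueError("host_rate must be positive")
    return max(stripe_len - transferred_len, 0) / host_rate


def reads_for_read_modify(write_slots):
    # Data before update of every written slot, plus the redundant
    # data before update: number of slots of the write data plus one.
    return write_slots + 1


def reads_for_all_stripes(data_drives, write_slots):
    # Every data slot of the stripe that is not overwritten must be read.
    return data_drives - write_slots


def all_stripes_selected(data_drives, write_slots):
    # Expression 1: reads(all stripes) <= reads(read and modify).
    return (reads_for_all_stripes(data_drives, write_slots)
            <= reads_for_read_modify(write_slots))
```

With six data drives (an assumed group size), a write covering three slots needs three reads under the method of all stripes against four under the method of read and modify, so the method of all stripes is chosen. A write covering two slots needs four reads against three, so it is not.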

The method of all stripes (second redundant data generating means) is made to be a stream of data shown in Fig. 6.
A read request is issued to the disk drive 300 for a read process, that is, to the drive controller 50
for controlling the disk drives 300 (300d to 300f) for saving data that are not intended for the write process. Then, the
data on the disk drives 300 (300d to 300f) are read onto the read plane 112 of the cache memory 110 (see (1) of Fig. 6).
The write data saved on the write plane 111 and the data read onto the read plane 112 are transferred to the redundant
data generator 130 (see (2) of Fig. 6). The redundant data generator 130 operates to generate the redundant data and
save it on the read plane 112 of the cache memory 110 (see (3) of Fig. 6). Next, the drive controller control function 36
is executed so that the redundant data on the read plane 112 of the cache memory 110 is transferred
to the disk drive 300 (300g) for saving the redundant data. This is the completion of updating the redundant data.

If the expression 1 is not met, the operation goes to a step 1030. As shown in (1) of Fig. 7 or 8, the request
for reading the data before update is issued to the disk drives 300 (300a to 300b) for saving the write data, and the
data before update is then transferred onto the read plane 112 of the cache memory 110. Then, the operation goes to a
step 1040. The drive state reference function 33 is executed to check if the disk drive 300 (300g) for saving the
redundant data is in use. If it is in use, the operation goes to a step 1170.

If the disk drive 300 (300g) for saving the redundant data is not in use, the operation goes to a step 1090. The
redundant data generating method selecting function 37 is executed to derive a cylinder number (#) on the disk medium
350 where the redundant data is saved and calculate a data length at each
turn of the cylinder # (track). In the case of a constant density recording whose information recording capacity per
cylinder of an inner circumference is different from that of an outer circumference, the data length for one circumference
is a function of the cylinder #. In addition, if the information recording
capacity of the inner circumference is equal to that of the outer circumference, this calculation is not necessary.
The data length per circumference in any cylinder can be immediately obtained from the device specification of
the disk drive 300.

Next, the microprocessor 30 operates to compare the data length per circumference with the data length of the
write data saved in the cache directory 120 and determine if the spinning on standby takes place. In actuality, an overhead
is required for operating the redundant data inside of the micro program. Assume that the overhead is 1/n of one spin
time of the disk medium 350. Since the write data length is required to be within one spin time containing the overhead
time, the condition for comparison is:

Write Data Length < (Data Length per Circumference of Disk Medium) × (1 - 1/n)    (expression 2)

When this condition is met, the redundant data generating method selecting function 37 is executed to select the
method of a generation in a drive (third means for generating redundant data) as the method for generating the
redundant data. Then, the operation goes to a step 1100. At the step 1090 shown in Fig. 5, it is assumed that
an overhead (1/n) for operating the redundant data inside of the micro program is 1/4.
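
The comparison of expression 2 can be illustrated with the short sketch below. The function name and the track length of 48,000 bytes are hypothetical values chosen only for the example; the inequality is rearranged into integer arithmetic (write length × n < track length × (n - 1)), which is equivalent for positive n and avoids rounding.

```python
def generation_in_drive_allowed(write_len, track_len, n=4):
    """Expression 2: write_len < track_len * (1 - 1/n).

    n is the reciprocal of the assumed overhead (1/n of one spin time);
    the description assumes 1/4, hence the default n=4.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    # Multiply both sides by n to stay in exact integer arithmetic.
    return write_len * n < track_len * (n - 1)
```

For a hypothetical track of 48,000 bytes and an overhead of 1/4, write data shorter than 36,000 bytes satisfies the condition, while write data of exactly 36,000 bytes or more does not.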

The operation under this assumption is executed as shown in Fig. 7. The write data on the write plane 111 and the data
before update on the read plane 112 are transferred to the disk drive 300 (300g) (see (2) of Fig. 7). The write data
transferred from the disk control device 20 and the data before update are transferred to a drive buffer 330 through a
disk I/F 310. In synchronization with this, the microprocessor 320 located inside of the disk drive 300 operates to
position the head at the position where the redundant data before update is saved through the disk control mechanism 340.
On the termination of the positioning operation, the redundant data before update is read into the drive buffer 330
(see (3) of Fig. 7). The exclusive OR of the redundant data before update thus read, the write data having been stored
in the buffer, and the data before update is calculated by the microprocessor 320 for generating the new redundant data
and storing it in the drive buffer 330 (see (4) of Fig. 7). Next, when the disk medium 350 has spun once and reaches the
positioning point, the microprocessor 320 operates to write the redundant data in the drive buffer 330 onto the disk
medium 350 (see (5) of Fig. 7). The disk drive 300 reports a success or failure of the generation and the write of the
redundant data to the disk control device 20. When the write of the redundant data is finished, the operation goes to
the next step 1110. The microprocessor 30 operates to detect whether the update of the redundant data has succeeded or
failed. If it has succeeded, the operation goes to a step 1120.

When the update of the redundant data has failed, the operation goes to a step 1170. At this step, for the retry
process to be executed if the update of the redundant data has failed or the redundant data is in use, the redundant data
generating method selecting function 37 is executed to select the method of read and modify.

The method of read and modify (first means for generating the redundant data) is executed as shown in Fig. 8. That
is, since the data before update has been read onto the read plane 112 of the cache memory 110, only the redundant
data before update on the disk drive 300 (300g) for saving the redundant data is read onto the read plane 112 of the
cache memory 110 (see (2) of Fig. 8). Then, the write data, the data before update, and the redundant data before
update are transferred into the redundant data generator 130 (see (3) of Fig. 8) for generating the redundant data from
those pieces of data and then saving the generated redundant data on the write plane 111 of the cache memory 110
(see (4) of Fig. 8). Next, the generated redundant data is written in the disk drive 300 (300g) for saving the redundant
data (see (5) of Fig. 8). This is the completion of updating the redundant data.
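
Assuming that the redundant data is exclusive-OR parity, the method of read and modify can be sketched as below. The helper names and the two-byte sample blocks are illustrative assumptions; the sketch only shows that XORing the redundant data before update, the data before update, and the write data yields the same parity as recomputing it over the whole stripe.

```python
from functools import reduce


def xor_blocks(*blocks):
    """Bytewise exclusive OR of equally sized blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def read_modify_parity(old_parity, old_data, new_data):
    # New redundant data = old redundant data XOR old data XOR write data.
    return xor_blocks(old_parity, old_data, new_data)


# A stripe of three data blocks and its parity (sample values).
d0, d1, d2 = b"\x11\x22", b"\x33\x44", b"\x55\x66"
parity = xor_blocks(d0, d1, d2)

# Overwrite d0 and update the parity without reading d1 and d2.
new_d0 = b"\xaa\xbb"
new_parity = read_modify_parity(parity, d0, new_d0)
```

Here `new_parity` equals `xor_blocks(new_d0, d1, d2)`, the value the method of all stripes would produce for the same write.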

If neither of the conditional expressions (1) and (2) is met, the operation goes to a step 1140. At this step,
the redundant data generating method selecting function 37 is executed to select the method of difference (fourth
means for generating the redundant data) as the method for generating the redundant data. The flow
of the process is as shown in Fig. 9. The difference data generator control function 35 is executed to transfer the write
data on the write plane 111 and the data before update on the read plane 112 to the difference data generator 140
(see (2) of Fig. 9) for generating the difference data and saving it on the read plane 112 (see (3) of Fig. 9) of the
cache memory 110.

Assuming that the redundant data is the exclusive ORed data, for example, the difference data termed herein
means the data given by taking an exclusive OR of the data A and B on the write plane 111 and the previous data a and
b on the read plane 112 corresponding to the write plane 111, all of which are shown in Fig. 9.
Next, the drive controller control function 36 is executed to transfer the difference data on the read plane 112 of the
cache memory 110 to the disk drive 300 (300g) for saving the redundant data through the use of the drive controller 50
(see (4) of Fig. 9). The difference data transferred from the disk control device 20 is then transferred to the drive
buffer 330 through the disk I/F 310. In synchronization with this, the microprocessor 320 located inside of the disk
drive 300 operates to position the head at the position where the redundant data before update is saved through the disk
control mechanism 340. When the positioning operation is terminated, the microprocessor 320 operates to read the
redundant data before update (see (5) of Fig. 9) and calculate an exclusive OR of the redundant data before update thus
read and the difference data saved in the drive buffer 330 for generating the redundant data (see (6) of Fig. 9) and
saving it in the drive buffer 330. Next, when the disk medium 350 has spun once and reaches the positioning point, the
microprocessor 320 operates to write the redundant data saved in the drive buffer 330 on the disk medium 350. If the
write of the redundant data is successful, the disk drive reports the normal termination to the disk control device 20.
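
Again assuming exclusive-OR parity, the division of work in the method of difference can be sketched as follows. The function names and sample bytes are illustrative assumptions: `make_difference` stands in for the difference data generator 140 on the disk control device, and `apply_difference` for the exclusive OR performed inside the disk drive.

```python
from functools import reduce


def xor_blocks(*blocks):
    """Bytewise exclusive OR of equally sized blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def make_difference(new_blocks, old_blocks):
    # Control device side: XOR of the write data (A, B) and the
    # data before update (a, b) gives one block of difference data.
    return xor_blocks(*new_blocks, *old_blocks)


def apply_difference(old_parity, difference):
    # Drive side: new redundant data = old redundant data XOR difference.
    return xor_blocks(old_parity, difference)


# Stripe with previous data a, b and an untouched block c (sample values).
a, b, c = b"\x01\x02", b"\x10\x20", b"\x0f\xf0"
old_parity = xor_blocks(a, b, c)

# Write data A and B replace a and b.
A, B = b"\xaa\x55", b"\x33\xcc"
difference = make_difference([A, B], [a, b])
new_parity = apply_difference(old_parity, difference)
```

Only one block of difference data is sent to the drive for saving the redundant data, regardless of how many slots are written, and `new_parity` equals `xor_blocks(A, B, c)`.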

On the other hand, the microprocessor 30 located inside of the disk control device 20 operates to recognize that
the generation of the redundant data has succeeded through the drive controller. Next, the operation goes to a
step 1120, at which the write data is written in the disk drive 300. Then, proceeding to a step 1130, the series of
writing operations is terminated.
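
The selection flow walked through above (expression 1, the in-use check, expression 2, and the fallback on failure) can be summarized in the following sketch. It is a simplified reading of the steps in the description, not the actual control program; the enum, the function names, and the parameters are assumptions made for illustration.

```python
from enum import Enum


class Method(Enum):
    READ_MODIFY = "read and modify"                # first means
    ALL_STRIPES = "all stripes"                    # second means
    GENERATION_IN_DRIVE = "generation in a drive"  # third means
    DIFFERENCE = "difference"                      # fourth means


def select_method(data_drives, write_slots, parity_drive_in_use,
                  write_len, track_len, n=4):
    """Pick a redundant data generating method for one write."""
    # Expression 1: reads(all stripes) <= reads(read and modify).
    if data_drives - write_slots <= write_slots + 1:
        return Method.ALL_STRIPES
    # Step 1040: the drive holding the redundant data is busy.
    if parity_drive_in_use:
        return Method.READ_MODIFY
    # Expression 2: write data fits in one spin minus the 1/n overhead.
    if write_len * n < track_len * (n - 1):
        return Method.GENERATION_IN_DRIVE
    # Neither expression holds.
    return Method.DIFFERENCE


def method_after_failed_update():
    # Step 1170: retry with read and modify when the in-drive update fails.
    return Method.READ_MODIFY
```

For example, with six data drives and a one-slot write to an idle redundant data drive, a short write selects the generation in a drive and a long one selects the method of difference. The drive count and lengths in such examples are hypothetical.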

The disk array device and the method for controlling the disk array device according to this embodiment are
controlled to select the most appropriate method for generating the redundant data from among the method of all stripes
and the method of read and modify, both of which are executed to generate the redundant data on the disk control device 20,
and the method of a generation in a drive and the method of difference, both of which are executed to generate the
redundant data on the disk drive 300. The selection is made according to the length of the write data received from the
host 10, the access pattern such as sequential accesses, whether or not the redundant data is in use (that is, the
magnitude of load) in the disk drive 300, and whether or not a failure takes place in the process of generating and
saving the redundant data. Hence, the disk array device and the control method thereof make it possible to reduce an
overhead accompanied with the generation of the redundant data in processing the write data, improve a response time and
a throughput of the disk array device in the process for writing the data, and enhance the reliability of generation of
the redundant data.



Claims

1. A disk array device comprising:

a plurality of disk drives (300) for saving data to be sent to a host system;

a subset of said disk drives (300) composing a specific data group for saving write data received from said host
system and at least one piece of redundant data generated from the write data;

a disk control device (20) for controlling data transfer between said host system and said disk drive (300);

a plurality of circuits (130) for generating said redundant data in different methods from each other; and

[...] said host system, an access mode specified by said host system, and a using status of [...]

4. A disk array device comprising:



a plurality of disk drives (300) for saving data to be sent to a host system;

a subset of said disk drives (300) composing a specific data group for saving write data received from said host
system and at least one piece of redundant data generated from said write data, each of said disk drives (300)
containing a redundant data generating circuit for generating the redundant data from the data received from
said host system and the data read from said disk;

a disk control device (20) for controlling data transfer between said host system and said disk drive (300),
said disk control device (20) having a redundant data generating circuit (130) and a selective control circuit (30)
for selecting said redundant data generating circuit inside of said drive (300) or said redundant data generating
circuit inside of said disk control device (20); and

said selective control circuit (30) serving to select the use of said redundant data generating circuit inside of
said drive (300) only, the use of said redundant data generating circuits inside of said drive (300) and said
control device (20), or the use of said redundant data generating circuit inside of said control device (20) only.

5. The disk array device as claimed in claim 4, wherein said disk control device (20) selects said first redundant data
generating circuit or said second redundant data generating circuit based on a length of said write data received
from said host system.

6. The disk array device as claimed in claim 4, wherein said disk control device (20) includes a plurality of said second
redundant data generating circuits for generating said redundant data in different methods from each other.

7. A disk array device comprising: 

a plurality of disk drives (300) for saving data to be sent to a host system;

a subset of said disk drives (300) composing a specific data group for saving divided parts of write data
received from said host system and at least one piece of redundant data generated from the divided part of
said write data in a distributed manner;

each of said disk drives (300) having a redundant data generating circuit for generating the redundant data
from the data received thereby and the data read out of a disk;

a disk control device (20) for controlling data transfer between said host system and said disk drive (300);

said disk control device (20) including:
a first redundant data generating circuit inside of said control device (20) itself for generating new redundant
data from said divided write data received from said host system, previous data to be updated by said divided
write data, and redundant data of said divided write data about said data group;

a second redundant data generating circuit inside of said control device (20) for generating new redundant data
from said divided write data received from said host system and data not to be updated by said divided
write data about said data group;

a difference data generating circuit (140) for generating difference data from said divided write data received
from said host system and the previous data to be updated by said divided write data; and
a selective control circuit (30) for selecting a circuit for generating said redundant data; and

wherein said selective control circuit operates to select, for generating said redundant data, the use of
said first redundant data generating circuit inside of said control device (20), the use of said second redundant
data generating circuit, the method of generating difference data with said difference data generating
circuit (140), transferring said difference data to said disk drive (300) for saving the redundant data of said
divided write data about said data group, and generating new redundant data from said difference data and
redundant data with said redundant data generating circuit inside of said drive, or the method of transferring
said divided write data received from said host system and the previous data to be updated by said divided
write data to said disk drive (300) for saving the redundant data of said divided write data about said data
group and generating new redundant data from said write data, said previous data and said redundant data
with said redundant data generating circuit inside of said drive (300).

8. The disk array device as claimed in claim 7, wherein

said selective control circuit (30) selects:

said second redundant data generating means inside of said control device (20) if the conditional
expression of

b + 1 ≤ 2 * a + 1 (a is a divisional number of said write data, b is a number of said disk drives (300)
used except for saving the redundant data included in said data group)

is met; 

said first redundant data generating means inside of said control device (20) if the drive for saving said redun- 
dant data is in use; 

transferring said divided write data and the previous data to be updated by said divided write data to said disk 
drive for saving the redundant data of said write data about said data group and generating new redundant 
data from said write data and said previous data and said redundant data with said redundant data generating 
circuit inside of said drive (300) if said write data length is equal to or shorter than a given length; and 
generating difference data generated by said difference data generating circuit, transferring said difference 
data to said disk drive for saving the redundant data of said divided write data about said data group, and gen- 
erating new redundant data from said difference data and redundant data with said redundant data generating 
circuit inside of said drive (300). 



9. The disk array device as claimed in claim 7, wherein the selective control circuit of said disk control device (20)
selects such a redundant data generating circuit as reduces the processing time of said write data, including
generation of the redundant data, to a minimum.

FIG. 2
FIG. 5
FIG. 10